

Search for: All records

Creators/Authors contains: "Nguyen, Thien H"

Note: Clicking a Digital Object Identifier (DOI) link takes you to an external site maintained by the publisher. Some full-text articles may not be available free of charge during the embargo period (administrative interval).


  1. We introduce two complementary techniques for efficient optimization that reduce memory requirements while accelerating training of large-scale neural networks. The first technique, Subset-Norm step size, generalizes AdaGrad-Norm and AdaGrad(-Coordinate) through step-size sharing. Subset-Norm (SN) reduces AdaGrad's memory footprint from O(d) to O(√d), where d is the model size. For non-convex smooth objectives under coordinate-wise sub-gaussian noise, we show a noise-adapted high-probability convergence guarantee with improved dimensional dependence of SN over existing methods. Our second technique, Subspace-Momentum, reduces the momentum state's memory footprint by restricting momentum to a low-dimensional subspace while performing SGD in the orthogonal complement. We prove a high-probability convergence result for Subspace-Momentum under standard assumptions. Empirical evaluation on pre-training and fine-tuning LLMs demonstrates the effectiveness of our methods. For instance, combining Subset-Norm with Subspace-Momentum achieves Adam's validation perplexity for LLaMA 1B in approximately half the training tokens (6.8B vs. 13.1B) while reducing Adam's optimizer-states memory footprint by more than 80% with minimal additional hyperparameter tuning. (An illustrative sketch of both techniques appears after this results list.)
    Free, publicly-accessible full text available July 13, 2026
  2. Reaction of Tl(OTf) with 2 equiv of bis(diisopropylamino)cyclopropenylidene (BAC) in THF results in formation of [Tl(BAC)2(OTf)] (1) in moderate yields. Subsequent reaction of 1 with [K][H2-9-BBN] ([H2-9-BBN] = dihydrido-9-boratabicyclo[3.3.1]nonane) in THF results in formation of [Tl(BAC)(μ-H2-9-BBN)]2 (3), also in moderate yield. Complex 3 is the first reported thallium borohydride. We attribute its thermal stability to the strong donor ability of the BAC co-ligand. Both 1 and 3 exhibit trigonal pyramidal geometries about Tl+ in the solid state, indicative of the presence of stereochemically active lone pairs. The hydride environment in 3 is calculated to exhibit a 3.9 ppm downfield shift, attributed to spin-orbit effects from the adjacent Tl center.
    Free, publicly-accessible full text available July 24, 2026
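
The first record above describes its two techniques in enough algorithmic detail to sketch them: Subset-Norm shares one AdaGrad-style accumulator across each subset of roughly √d coordinates (so only about √d accumulators are stored instead of d), and Subspace-Momentum keeps the momentum state only in a k-dimensional subspace while applying plain SGD in the orthogonal complement. The NumPy sketch below is an illustrative reading of those two ideas on a toy quadratic objective, not the authors' implementation; the function names (subset_norm_step, subspace_momentum_step), the contiguous coordinate partition, the fixed random orthonormal projection, and all hyperparameter values are assumptions made here for demonstration.

# Illustrative sketch of Subset-Norm step sizes and Subspace-Momentum,
# as described in record 1 above. NOT the authors' implementation; the
# names, coordinate partition, random projection, and hyperparameters
# below are assumptions for demonstration only.
import numpy as np


def subset_norm_state(d, subset_size):
    """One accumulator per coordinate subset: O(d / subset_size) memory.
    With subset_size ~ sqrt(d), the state is O(sqrt(d)) instead of AdaGrad's O(d)."""
    return np.zeros(int(np.ceil(d / subset_size)))


def subset_norm_step(x, grad, acc, subset_size, lr=0.5, eps=1e-8):
    """AdaGrad-style update in which all coordinates of a subset share one step size."""
    d = x.size
    for i in range(acc.size):
        sl = slice(i * subset_size, min((i + 1) * subset_size, d))
        acc[i] += np.sum(grad[sl] ** 2)                 # accumulate squared subset norm
        x[sl] -= lr * grad[sl] / (np.sqrt(acc[i]) + eps)
    return x, acc


def subspace_momentum_step(x, grad, m, P, lr=0.1, beta=0.9):
    """Momentum lives only in the k-dim subspace spanned by P's orthonormal columns
    (k << d), so the momentum state is k numbers; plain SGD handles the complement."""
    g_low = P.T @ grad                                  # project gradient into the subspace
    m = beta * m + (1.0 - beta) * g_low                 # k-dimensional momentum state
    update = P @ m + (grad - P @ g_low)                 # momentum part + SGD in the complement
    x -= lr * update
    return x, m


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, k = 4096, 32
    subset_size = int(np.sqrt(d))                       # sqrt(d)-sized subsets

    # Toy objective f(x) = 0.5 * ||x||^2 (gradient = x), just to exercise the updates.
    x = rng.standard_normal(d)
    acc = subset_norm_state(d, subset_size)
    for _ in range(500):
        x, acc = subset_norm_step(x, x.copy(), acc, subset_size)
    print("Subset-Norm        final loss:", 0.5 * float(x @ x))

    x = rng.standard_normal(d)
    P, _ = np.linalg.qr(rng.standard_normal((d, k)))    # random k-dim orthonormal basis
    m = np.zeros(k)
    for _ in range(500):
        x, m = subspace_momentum_step(x, x.copy(), m, P)
    print("Subspace-Momentum  final loss:", 0.5 * float(x @ x))

How the subspace is chosen in practice (for example, from leading gradient directions rather than at random) and how the two techniques are combined into a single optimizer are left to the paper; the sketch only demonstrates each update rule in isolation.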